'
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "url": "https://calvin-data304.netlify.app/data/weather-with-dates.csv"
  },
  "height": 400,
  "width": 500,
  "title": "Mean Max Temperature Per Month",
  "mark": "circle",
  "encoding": {
    "x": {"field": "month", "type": "temporal", "timeUnit": "month", "title": "Month"},
    "y": {"field": "temp_max", "type": "quantitative", "aggregate": "average", "title": "Temperature (F)"},
    "color": {"field": "year", "title": "Year"},
    "fillOpacity": {"value": 0.8},
    "size": {"value": 80}
  }
}' |> as_vegaspec()
DATA 304 Homework 5
Exercise 1
Copy of Graphic:
Source: r/dataisbeautiful
What marks are being used? What variables are mapped to which properties?
The marks are lines. The variables are mapped as follows:
time after occurrence (years) -> x
percentage change -> y
worst-day occurrence -> color
What is the main story of this graphic?
The main story of the graphic is how long the stock market took to recover after three of its “worst days” between 1987 and 2020. After 2008’s Great Recession, it took almost 4 years to return to its level before the drop. After 1987’s Black Monday it took approximately 2 years, and after the COVID-19 crash it took less than a year.
What makes it a good graphic?
I really like that the x axis is not a calendar date. Instead, it counts the number of years until recovery. I think this makes the most sense for comparing recovery times: with calendar dates on the x axis, the events would be hard to compare because the worst days are many years apart. I also like that the colors contrast clearly with one another.
What features do you think you would know how to implement in Vega-Lite?
I would create a line chart, layering each of the “worst days” and assigning cyan, pink, and yellow to them. I would use "background": "black" to change the background color. If the data were stored as dates, I would calculate the difference between each event's first date and every later row in the dataset, so that the x axis shows time since the drop and the three events can be layered on top of each other.
I would also have to add a text layer for the explanatory text in the lower right.
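The plan above could be sketched roughly as follows. This is only a sketch under assumptions: the field names (event, drop_date, date, pct_change), the event labels, and every inline value are made-up placeholders rather than the data behind the original graphic, and "hotpink" is just my guess at the pink.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: field names and inline values are made-up placeholders, not the real market data.",
  "background": "black",
  "width": 500,
  "height": 400,
  "config": {
    "axis": {"labelColor": "white", "titleColor": "white", "gridColor": "#444444"},
    "legend": {"labelColor": "white", "titleColor": "white"}
  },
  "layer": [
    {
      "data": {
        "values": [
          {"event": "Event A", "drop_date": "2000-01-01", "date": "2000-01-01", "pct_change": -20},
          {"event": "Event A", "drop_date": "2000-01-01", "date": "2002-01-01", "pct_change": 0},
          {"event": "Event B", "drop_date": "2010-01-01", "date": "2010-01-01", "pct_change": -30},
          {"event": "Event B", "drop_date": "2010-01-01", "date": "2014-01-01", "pct_change": 0},
          {"event": "Event C", "drop_date": "2020-01-01", "date": "2020-01-01", "pct_change": -25},
          {"event": "Event C", "drop_date": "2020-01-01", "date": "2021-01-01", "pct_change": 0}
        ]
      },
      "transform": [
        {
          "calculate": "(toDate(datum.date) - toDate(datum.drop_date)) / (1000 * 60 * 60 * 24 * 365.25)",
          "as": "years_after"
        }
      ],
      "mark": "line",
      "encoding": {
        "x": {"field": "years_after", "type": "quantitative", "title": "Years after the drop"},
        "y": {"field": "pct_change", "type": "quantitative", "title": "Change from pre-drop level (%)"},
        "color": {
          "field": "event",
          "type": "nominal",
          "title": "Worst day",
          "scale": {"range": ["cyan", "hotpink", "yellow"]}
        }
      }
    },
    {
      "data": {"values": [{"label": "Explanatory note goes here"}]},
      "mark": {"type": "text", "align": "right", "baseline": "bottom", "color": "white"},
      "encoding": {
        "x": {"value": 490},
        "y": {"value": 390},
        "text": {"field": "label", "type": "nominal"}
      }
    }
  ]
}
```

The calculate transform turns each row's date into years since that event's drop, which is what lets the three lines share one x axis. The text layer is placed with pixel values near the lower-right corner of the 500 by 400 view.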
Are there any features of the graphic that you would not know how to do in Vega-Lite? If so, list them.
I am not sure how to add the “extra” lines on the x axis; most of my graphics so far just use Vega-Lite’s default formatting. I would also have to learn how to cut off the lines so that they do not extend above 0% on the y axis.
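One approach that might work, though I have not tested it against this graphic: fix the y scale's domain so it tops out at 0 and set "clip": true on the mark, and use axis properties such as tickCount and grid to control the lines drawn from the x axis. In the sketch below, the field names and values are made-up placeholders, and whether tickCount and grid reproduce the original's “extra” lines is an assumption.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: placeholder values, chosen so that the last point rises above 0%.",
  "data": {
    "values": [
      {"years_after": 0, "pct_change": -30},
      {"years_after": 1, "pct_change": -10},
      {"years_after": 2, "pct_change": 15}
    ]
  },
  "mark": {"type": "line", "clip": true},
  "encoding": {
    "x": {
      "field": "years_after",
      "type": "quantitative",
      "axis": {"tickCount": 8, "grid": true}
    },
    "y": {
      "field": "pct_change",
      "type": "quantitative",
      "scale": {"domain": [-40, 0]}
    }
  }
}
```

With the domain fixed at [-40, 0] and clipping on, the part of the line above 0% should be cut off at the top edge of the plot rather than drawn outside it.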
Exercise 2
Visualizing Weather 1
Create a graphic that shows the mean temperature for each month. How many “months” should you be displaying? (There is more than one answer to this – perhaps try doing it more than one way.)
Way 1
In this graphic, I am showing the mean max temperature per month. Color shows which year each monthly mean belongs to.
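Another possible answer to the “how many months” question is to show one point per month of each year instead of 12 months colored by year. A minimal sketch, assuming the CSV has a parseable date column named date (I am guessing at that name from the file name, so it may need to be changed):

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: assumes the CSV has a parseable column named date.",
  "data": {
    "url": "https://calvin-data304.netlify.app/data/weather-with-dates.csv"
  },
  "height": 400,
  "width": 500,
  "title": "Mean Max Temperature Per Year-Month",
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal", "timeUnit": "yearmonth", "title": "Month"},
    "y": {"field": "temp_max", "type": "quantitative", "aggregate": "average", "title": "Temperature (F)"}
  }
}
```

The yearmonth time unit keeps the year when grouping, so each month of each year gets its own mean instead of being pooled with the same month from other years.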
Way 2
In this version, I use R to calculate the actual mean temperature for each day, (temp_max + temp_min) / 2, and then let Vega-Lite aggregate by month.
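The daily mean could probably also be computed inside Vega-Lite itself with a calculate transform, without preprocessing the data first. A minimal sketch, assuming the CSV columns are the ones used in my other specs (month, year, temp_max, temp_min):

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: computes the daily mean inside Vega-Lite instead of beforehand.",
  "data": {
    "url": "https://calvin-data304.netlify.app/data/weather-with-dates.csv"
  },
  "transform": [
    {"calculate": "(datum.temp_max + datum.temp_min) / 2", "as": "mean_daily_temp"}
  ],
  "height": 400,
  "width": 500,
  "title": "Mean Temperature Per Month",
  "mark": "circle",
  "encoding": {
    "x": {"field": "month", "type": "ordinal", "title": "Month"},
    "y": {"field": "mean_daily_temp", "type": "quantitative", "aggregate": "average", "title": "Temperature (F)"},
    "color": {"field": "year", "type": "nominal", "title": "Year"}
  }
}
```

The version I actually used, with the preprocessing done beforehand, follows.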
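library(jsonlite)  # provides toJSON()

# Read the Seattle weather data and compute each day's mean temperature
seattle_data <- read.csv("https://calvin-data304.netlify.app/data/weather-with-dates.csv")
seattle_data$mean_daily_temp <- (seattle_data$temp_max + seattle_data$temp_min) / 2

# Serialize the data frame and splice it into the Vega-Lite spec
seattle_weather_json <- toJSON(seattle_data, pretty = TRUE)
seattle_mean_temp <- paste0(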
'{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {"values": ', seattle_weather_json, '},
  "height": 400,
  "width": 500,
  "title": "Mean Temperature Per Month",
  "mark": "circle",
  "encoding": {
    "x": {"field": "month", "type": "ordinal"},
    "y": {"field": "mean_daily_temp", "type": "quantitative", "aggregate": "average", "title": "Temperature (F)"},
    "color": {"field": "year", "title": "Year"},
    "fillOpacity": {"value": 0.8},
    "size": {"value": 80}
  }
}'
)
vegawidget::vegawidget(seattle_mean_temp) |> browsable()
Visualizing Weather 2
Exercise 3 Create a graphic that shows how the different types of weather (rain, fog, etc.) are distributed by month in Seattle. When is it rainiest in Seattle? Sunniest?